Smart Paper Notes

Chronic Pain and Language: A Topic Modelling Approach to Personal Pain Descriptions

Diogo A. P. Nunes , David Martins de Matos , Joana Ferreira Gomes , Fani Neto

Category: Natural Language Processing

2021-09-01

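Chronic pain is recognized as a major health problem, with impacts not only at the economic level but also at the social and individual levels. Because it is a private and subjective experience, it is impossible to experience, describe, and interpret chronic pain externally and impartially as a purely noxious stimulus that points directly to a causal agent and facilitates its mitigation, in contrast to acute pain, whose assessment is usually straightforward. Verbal communication is therefore key to conveying relevant information to health professionals that would otherwise be inaccessible to external entities, namely the intrinsic qualities of the pain experience and of the patient. We propose and discuss a topic modelling approach to recognize patterns in verbal descriptions of chronic pain, and we use these patterns to quantify and qualify the experience of pain. Our approach allows novel insights into chronic pain experiences to be extracted from the resulting topic models and latent spaces. We argue that our results are clinically relevant to the assessment and management of chronic pain.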

Translated by Google Translate

Analysis of Chronic Pain Experiences Based on Online Reports: the RRCP Dataset for quality-of-life assessment

Diogo A. P. Nunes , David Martins de Matos , Fani Neto , Joana Ferreira Gomes

Category: Natural Language Processing

2021-08-23

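Objective: Verify the applicability of natural language processing (NLP) techniques to reveal and quantify the qualities of the chronic pain experience, by means of the novel Reddit Reports of Chronic Pain (RRCP) dataset, which aims to become a standard for future research in this underdeveloped area. Methods: Define and validate the RRCP dataset for a set of pathologies related to chronic pain. For each pathology, identify the main qualities of the chronic pain experience. Compare the qualities identified for each pathology and validate them against clinical research. Results: The RRCP dataset contains 136,573 Reddit submissions from 12 subreddits related to chronic pain. Macro-level analysis shows that pathologies affecting the same or similar body parts lead to semantically similar descriptions of pain. Detailed analysis shows that some qualities of chronic pain distinguish the experience within a given pathology from the experience in another pathology, while others are common to various experiences of chronic pain. These allow us to compare subjective experiences of chronic pain (e.g., for the RRCP population, experiencing arthritis is similar to experiencing ankylosing spondylitis in terms of various qualities or concerns, while the experience of fibromyalgia includes the same qualities as well as traits of the other two pathologies). Conclusion: Our unsupervised semantic analysis of chronic pain descriptions reflects clinical knowledge about how different pathologies manifest in terms of the chronic pain experience. Our results validate the use of NLP techniques to automatically extract and quantify clinically relevant information from descriptions of chronic pain experiences.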

Using attention methods to predict judicial outcomes

Vithor Gomes Ferreira Bertalan , Evandro Eduardo Seron Ruiz

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing

2022-07-18

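Legal judgment prediction is one of the most acclaimed fields in the combined area of NLP, AI, and law. By legal prediction we mean an intelligent system capable of predicting specific judicial characteristics of a particular case, such as the judicial outcome or the judicial class. In this research, we use AI classifiers to predict judicial outcomes in the Brazilian legal system. For this purpose, we developed a text crawler to extract data from the official Brazilian electronic legal systems. These texts form a dataset of second-degree murder and active corruption cases. We applied different classifiers, such as support vector machines and neural networks, to predict judicial outcomes by analyzing textual features from the dataset. Our research shows that regression trees, gated recurrent units, and hierarchical attention networks yielded higher metrics for different subsets. As a final goal, we explored the weights of one of the algorithms, the hierarchical attention network, to find a sample of the most important words used to absolve or convict defendants.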

Panoptic Segmentation Meets Remote Sensing

Osmar Luiz Ferreira de Carvalho , Osmar Abílio de Carvalho Júnior , Cristiano Rosa e Silva , Anesmar Olino de Albuquerque , Nickolas Castro Santana , Dibio Leandro Borges , Roberto Arnaldo Trancoso Gomes , Renato Fontes Guimarães

Categories: Computer Vision | Artificial Intelligence

2021-11-23

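Panoptic segmentation combines instance and semantic predictions, allowing "things" and "stuff" to be detected simultaneously. Effectively approaching panoptic segmentation in remotely sensed data can be promising for many challenging problems, since it allows continuous mapping and specific target counting. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labels must contain both "things" and "stuff" classes, and (c) the annotation format is complex. Thus, aiming to solve these issues and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation, (2) propose annotation conversion software to generate panoptic annotations, (3) propose a novel dataset on urban areas, (4) modify Detectron2 for the task, and (5) evaluate the difficulties of this task in the urban setting. We used an aerial image with a spatial resolution of 0.24 m, considering 14 classes. Our pipeline considers three image inputs, and the proposed software uses point shapefiles to create samples in the COCO format. Our study generated 3,400 samples of 512x512 pixels. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation and an extensive database for other researchers to use and to handle other data or related problems that require thorough scene understanding.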
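The PQ figure reported above is not defined in the abstract. The sketch below assumes the commonly used definition of panoptic quality: a predicted and a ground-truth segment match when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5 FP + 0.5 FN. The function names and the representation of segments as sets of pixel coordinates are illustrative assumptions, not the authors' Detectron2 evaluation code.

```python
def iou(a: set, b: set) -> float:
    """Intersection over union of two segments given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments, threshold=0.5):
    """Panoptic quality for one class (hypothetical helper for illustration).

    Segments are assumed not to overlap within each list, so an IoU above 0.5
    yields at most one match per segment.
    """
    matched_pred = set()
    iou_sum = 0.0
    true_positives = 0
    for gt in gt_segments:
        for index, pred in enumerate(pred_segments):
            if index in matched_pred:
                continue
            score = iou(pred, gt)
            if score > threshold:
                matched_pred.add(index)
                iou_sum += score
                true_positives += 1
                break
    false_positives = len(pred_segments) - len(matched_pred)
    false_negatives = len(gt_segments) - true_positives
    denominator = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    return iou_sum / denominator if denominator else 0.0


# Toy data (made up): one good match, one missed object, one spurious prediction.
ground_truth = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
predictions = [{(0, 0), (0, 1), (1, 0)}, {(9, 9)}]
pq = panoptic_quality(predictions, ground_truth)  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

In the toy data the single match has IoU 0.75, and the unmatched prediction and unmatched ground-truth segment each add 0.5 to the denominator, so the score drops to 0.375.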

Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

Osmar Luiz Ferreira de Carvalho , Osmar Abílio de Carvalho Júnior , Anesmar Olino de Albuquerque , Nickolas Castro Santana , Dibio Leandro Borges , Roberto Arnaldo Trancoso Gomes , Renato Fontes Guimarães

Categories: Computer Vision | Artificial Intelligence

2021-11-23

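Vehicle classification is a hot computer vision topic, with studies ranging from ground-view to top-view imagery. In remote sensing, the use of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and more. However, there are some difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for that task, (b) creating instance segmentation datasets is laborious, and (c) traditional instance segmentation methods underperform on this task because the objects are small. Thus, the objectives of this research are: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure consists of: (1) labeling a small number of vehicles, (2) training on those samples, (3) using the model to classify the entire image, (4) converting the image prediction into a polygon shapefile, (5) correcting some areas with errors and including them in the training data, and (6) repeating until the results are satisfactory. To separate instances, we considered vehicle interiors and vehicle borders, and the DL model was a U-Net with an EfficientNet-B7 backbone. When the borders are removed, the vehicle interiors become isolated, allowing unique object identification. To recover the deleted 1-pixel borders, we propose a simple method that expands each prediction. The results show better pixel-wise metrics than Mask R-CNN (82% against 67% in IoU). In the per-object analysis, the overall accuracy, precision, and recall were greater than 90%. The pipeline applies to any remote sensing target and is very efficient for segmentation and for generating datasets.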
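The abstract does not spell out how instances are separated or how each prediction is expanded. The sketch below is one plausible reading, under stated assumptions: interiors are labeled as 4-connected components once the border class is removed, and each instance is then grown by one pixel (8-neighbourhood) into background only. All function names are hypothetical, and the tie-breaking rule for pixels adjacent to two instances is an arbitrary choice made here.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected regions of truthy cells; 0 is background."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one interior
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def expand_one_pixel(labels):
    """Grow every instance by one pixel into background cells only.

    A background cell touching several instances takes the first label
    found in scan order (an assumption of this sketch).
    """
    rows, cols = len(labels), len(labels[0])
    grown = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            neighbours = (
                labels[r + dy][c + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if 0 <= r + dy < rows and 0 <= c + dx < cols
            )
            grown[r][c] = next((label for label in neighbours if label), 0)
    return grown


# Toy interior mask (made up): two vehicle interiors with borders already removed.
interior = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
instance_labels, vehicle_count = label_components(interior)
restored = expand_one_pixel(instance_labels)
```

Here `vehicle_count` is 2, and after expansion each vehicle covers a 3x3 block while the middle column stays background, so the two instances remain distinct.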

Anxolotl, an Anxiety Companion App -- Stress Detection

Nuno Gomes , Matilde Pato , Pedro Santos , André Lourenço , Lourenço Rodrigues

Category: Machine Learning

2022-12-28

Stress has a great effect on people's lives that cannot be overstated. While it can be beneficial, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life while tackling this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, achieving a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution passed the robustness test, showing low variation between runs, which was a major point in favor of its possible integration into the Anxolotl app in the future.

A comprehensive analysis of the Elo rating algorithm: Stochastic model, convergence characteristics, design guidelines, and experimental results

Daniel Gomes de Pinho Zanco , Leszek Szczecinski , Eduardo Vinicius Kuhn , Rui Seara

Categories: Machine Learning | Artificial Intelligence

2022-12-22

The Elo algorithm, due to its simplicity, is widely used for rating in sports competitions as well as in other applications where the rating/ranking is a useful tool for predicting future results. However, despite its widespread use, a detailed understanding of the convergence properties of the Elo algorithm is still lacking. Aiming to fill this gap, this paper presents a comprehensive (stochastic) analysis of the Elo algorithm, considering round-robin (one-on-one) competitions. Specifically, analytical expressions are derived characterizing the behavior/evolution of the skills and of important performance metrics. Then, taking into account the relationship between the behavior of the algorithm and the step-size value, which is a hyperparameter that can be controlled, some design guidelines as well as discussions about the performance of the algorithm are provided. To illustrate the applicability of the theoretical findings, experimental results are shown, corroborating the very good match between analytical predictions and those obtained from the algorithm using real-world data (from the Italian SuperLega, Volleyball League).
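For readers unfamiliar with the update whose convergence is analyzed above, here is a minimal sketch of one Elo step. It assumes the conventional logistic form (base 10, scale 400) and an illustrative step size of 32; these constants and the function names are assumptions for illustration, not values or notation taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that player A beats player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a: float, rating_b: float, score_a: float, step_size: float = 32.0):
    """Return both ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, and 0.0 for a loss.
    step_size is the hyperparameter (often called K) that controls how far
    ratings move after each result.
    """
    delta = step_size * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated players: the expected score is 0.5, so the winner gains 16 points.
new_a, new_b = elo_update(1500.0, 1500.0, 1.0)
```

A larger step size makes ratings react faster to new results but also makes them noisier, which is the kind of trade-off the design guidelines in the abstract refer to.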

Out-Of-Distribution Detection Is Not All You Need

Joris Guérin , Kevin Delmas , Raul Sena Ferreira , Jérémie Guiochet

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-11-29

The usage of deep neural networks in safety-critical systems is limited by our ability to guarantee their correct behavior. Runtime monitors are components aiming to identify unsafe predictions and discard them before they can lead to catastrophic consequences. Several recent works on runtime monitoring have focused on out-of-distribution (OOD) detection, i.e., identifying inputs that are different from the training data. In this work, we argue that OOD detection is not a well-suited framework to design efficient runtime monitors and that it is more relevant to evaluate monitors based on their ability to discard incorrect predictions. We call this setting out-of-model-scope detection and discuss the conceptual differences with OOD. We also conduct extensive experiments on popular datasets from the literature to show that studying monitors in the OOD setting can be misleading: (1) very good OOD results can give a false impression of safety, and (2) comparison under the OOD setting does not allow identifying the best monitor to detect errors. Finally, we also show that removing erroneous training data samples helps to train better monitors.
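The difference between the two evaluation settings can be illustrated with a toy calculation. The sketch below scores the same monitor twice: once with "input is OOD" as the positive class, and once with "prediction is incorrect" as the positive class. The `Sample` record, the `rejection_recall` helper, and the data are all made up for illustration and do not come from the paper's experiments.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    is_ood: bool    # input differs from the training distribution
    correct: bool   # the model's prediction was right
    rejected: bool  # the runtime monitor discarded the prediction


def rejection_recall(samples, is_positive) -> float:
    """Fraction of positive samples that the monitor rejected."""
    positives = [s for s in samples if is_positive(s)]
    if not positives:
        return 0.0
    return sum(s.rejected for s in positives) / len(positives)


# Hypothetical monitor that rejects exactly the OOD inputs.
samples = (
    [Sample(is_ood=False, correct=True, rejected=False)] * 4
    + [Sample(is_ood=False, correct=False, rejected=False)] * 2  # in-distribution errors slip through
    + [Sample(is_ood=True, correct=False, rejected=True)] * 2
)

ood_recall = rejection_recall(samples, lambda s: s.is_ood)         # 2 of 2 OOD inputs rejected
error_recall = rejection_recall(samples, lambda s: not s.correct)  # 2 of 4 errors rejected
```

On this toy data the monitor looks perfect in the OOD setting (recall 1.0) yet catches only half of the model's errors (recall 0.5), which is the kind of false impression of safety the abstract describes.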

Unimodal and Multimodal Representation Training for Relation Extraction

Ciaran Cooney , Rachel Heyburn , Liam Maddigan , Mairead O'Cuinn , Chloe Thompson , Joana Cavadas

Category: Natural Language Processing

2022-11-11

Multimodal integration of text, layout and visual information has achieved SOTA results in visually rich document understanding (VrDU) tasks, including relation extraction (RE). However, despite its importance, evaluation of the relative predictive capacity of these modalities is less prevalent. Here, we demonstrate the value of shared representations for RE tasks by conducting experiments in which each data type is iteratively excluded during training. In addition, text and layout data are evaluated in isolation. While a bimodal text and layout approach performs best (F1=0.684), we show that text is the most important single predictor of entity relations. Additionally, layout geometry is highly predictive and may even be a feasible unimodal approach. Despite being less effective, we highlight circumstances where visual information can bolster performance. In total, our results demonstrate the efficacy of training joint representations for RE.

Toward Human-AI Co-creation to Accelerate Material Discovery

Dmitry Zubarev , Carlos Raoni Mendes , Emilio Vital Brazil , Renato Cerqueira , Kristin Schmidt , Vinicius Segura , Juliana Jansen Ferreira , Dan Sanders

Categories: Machine Learning | Artificial Intelligence

2022-11-05

There is an increasing need in our society to achieve faster advances in science to tackle urgent problems such as climate change, environmental hazards, sustainable energy systems, and pandemics, among others. In certain domains like chemistry, scientific discovery carries the extra burden of assessing the risks of proposed novel solutions before moving to the experimental stage. Despite several recent advances in machine learning and AI to address some of these challenges, there is still a gap in technologies to support end-to-end discovery applications, integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling knowledge consumption and production in a timely and efficient way for subject matter experts (SMEs). Furthermore, the discovery of novel functional materials strongly relies on the development of exploration strategies in the chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity that often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims at enabling human-AI co-creation to reduce the time until the first discovery and the opportunity costs involved. This framework relies on a knowledge base with domain and process knowledge, and on user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.
